Mining of Rare Itemsets in Distributed Environment
Authors
Abstract
Mining rare itemsets means finding items that occur rarely in a dataset. Doing so under a single minimum support (minsup) constraint is difficult, because a low minsup generates too many rules, some of which are uninteresting [3]. The "multiple minsup framework" was proposed in the literature [4, 5] to discover rare itemsets efficiently. However, that model still extracts uninteresting rules when item frequencies in a dataset vary widely. In this paper, we use the notion of "item-to-pattern difference" and the multiple-minsup-based, FP-growth-like approach proposed in [6] to discover rare itemsets efficiently in a distributed environment. To discover global rare itemsets, information about the itemsets at the local sites is collected at one site in the form of MIS-trees: each site sends its local MIS-tree to a single site, where a global MIS-tree is constructed from all the MIS-trees received. This global MIS-tree is then mined to generate the global rare itemsets. Experimental results show that this approach is efficient in terms of the communication bandwidth consumed.
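The collect-and-merge step described in the abstract can be illustrated with a small sketch. It is not the authors' algorithm: a plain prefix tree stands in for the MIS-tree, the global tree is mined by brute-force enumeration rather than by an FP-growth-like procedure, and the "item-to-pattern difference" filter is omitted. The transactions, the item order, and the per-item minimum supports (`mis`, given as absolute counts) are hypothetical. The only rule taken from the multiple minsup framework is that an itemset qualifies when its support reaches the lowest minimum support among its items.

```python
from itertools import combinations

class Node:
    """Prefix-tree node: a count plus children keyed by item."""
    def __init__(self):
        self.count = 0
        self.children = {}

def build_tree(transactions, rank):
    """Build a local prefix tree; items are inserted in the shared rank order."""
    root = Node()
    for t in transactions:
        node = root
        for item in sorted(set(t), key=rank.__getitem__):
            node = node.children.setdefault(item, Node())
            node.count += 1
    return root

def merge_into(global_node, local_node):
    """Add the counts of a local tree into the global tree, path by path."""
    for item, child in local_node.children.items():
        target = global_node.children.setdefault(item, Node())
        target.count += child.count
        merge_into(target, child)

def support(node, items, rank):
    """Count transactions containing all `items` (a tuple sorted by rank)."""
    total = 0
    for item, child in node.children.items():
        if item == items[0]:
            if len(items) == 1:
                total += child.count
            else:
                total += support(child, items[1:], rank)
        elif rank[item] < rank[items[0]]:
            # items[0] can only appear deeper on paths through earlier items
            total += support(child, items, rank)
    return total

def mine(root, order, mis, max_len=3):
    """Keep itemsets whose support reaches the lowest MIS among their items."""
    rank = {item: i for i, item in enumerate(order)}
    found = {}
    for k in range(1, max_len + 1):
        for combo in combinations(order, k):  # tuples already follow `order`
            s = support(root, combo, rank)
            if s >= min(mis[i] for i in combo):
                found[combo] = s
    return found

# Hypothetical data: two sites, a shared item order, per-item minimum supports.
order = ["bread", "milk", "caviar", "wine"]
rank = {item: i for i, item in enumerate(order)}
mis = {"bread": 4, "milk": 3, "caviar": 2, "wine": 2}
site_a = [["bread", "milk"], ["bread", "milk"], ["bread"], ["bread", "caviar", "wine"]]
site_b = [["bread", "milk"], ["caviar", "wine"], ["bread"], ["milk"]]

global_tree = Node()
for local in (build_tree(site_a, rank), build_tree(site_b, rank)):
    merge_into(global_tree, local)  # each site ships only its tree

patterns = mine(global_tree, order, mis)
```

In this toy data the rare pair ("caviar", "wine") qualifies because its items carry a low minimum support, while ("bread", "caviar") does not. Each site transmits one tree rather than its raw transactions, which is the source of the bandwidth saving the abstract refers to.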
Similar resources
A Novel Algorithm for Mining Rare-Utility Itemsets in a Multi-Database Environment
Utility mining has recently been an emerging topic in the field of data mining. It finds out high-utility itemsets by considering both the important factors of profit and quantity. In some situations, rarely occurring items may co-occur in a relatively close relationship with specific high-utility items. These utility itemsets with rare items may provide useful information to decision makers as...
A Distributed Approach to Extract High Utility Itemsets from XML Data
This paper investigates a new data mining capability that entails mining of High Utility Itemsets (HUI) in a distributed environment. Existing research in data mining deals only with the presence or absence of an item and does not consider semantic measures such as the weight or cost of the items. Thus, HUI mining algorithms have evolved. HUI mining is one kind of utility mining that aims to iden...
Mining of High Utility Itemsets in Service Oriented Computing
Service Oriented Computing that uses Knowledge as a Service makes use of the utility mining approach. Here, we propose an architecture called Knowledge as a Service (KaaS), in which utility mining algorithms extract knowledge data from the data owners when knowledge consumers need particular knowledge data. The main motive behind proposing this architecture is to ...
A Fuzzy Algorithm for Mining High Utility Rare Itemsets – FHURI
Classical frequent itemset mining identifies frequent itemsets in transaction databases using only the frequency of item occurrences, without considering the utility of items. In many real-world situations, the utility of itemsets is based on the user's perspective, such as cost, profit or revenue, and is of significant importance. Utility mining considers using utility factors in data mining tasks. Utility-...
Collusion-Free Privacy Preserving Data Mining
Distributed association rule mining is an integral part of data mining that extracts useful information hidden in distributed data sources. As local frequent itemsets are globalized from data sources, sensitive information about individual data sources needs high protection. Different privacy preserving data mining approaches for distributed environment have been proposed but in the existing ap...
Journal title:
Volume, issue:
Pages: -
Publication date: 2015